Gemma 3n is a family of lightweight, state-of-the-art open multimodal models from Google, built on the same research and technology as the Gemini models. It accepts text, audio, and visual inputs, which makes it suitable for a wide range of tasks.
Image-to-Text
Transformers